Persistent fluctuations in the distribution of galaxies from the Two
degree Field Galaxy Redshift Survey






Francesco Sylos Labini 1,2, Nikolay L. Vasilyev 3 and Yurij V. Baryshev 3



1 Museo Storico della Fisica e Centro Studi e Ricerche Enrico Fermi - Piazzale del Viminale 1, 00184 Rome, Italy
2 Istituto dei Sistemi Complessi CNR - Via dei Taurini 19, 00185 Rome, Italy
3 Institute of Astronomy, St. Petersburg State University - Staryj Peterhoff, 198504, St. Petersburg, Russia



PACS 98.80.-k - Cosmology

PACS 05.40.-a - Fluctuation phenomena in random processes

PACS 02.50.-r - Probability theory, stochastic processes, and statistics 

Abstract. - We apply the scale-length method to several three-dimensional samples of the
Two degree Field Galaxy Redshift Survey. This method allows us to map large scale structures
in the distribution of galaxies quantitatively while controlling systematic effects.
By determining the probability density function of conditional fluctuations we show that large
scale structures are quite typical and correspond to large fluctuations in the galaxy density field.
We do not find a convergence to homogeneity up to the sample sizes, i.e. ≈ 75 Mpc/h. We then
measure, at scales r < 40 Mpc/h, a well defined and statistically stable power-law behavior of the
average number of galaxies in spheres, with fractal dimension D = 2.2 ± 0.2. We point out that
standard models of structure formation are unable to explain the existence of the large fluctuations
in the galaxy density field detected in these samples. This conclusion is reached in two ways: by
considering the scale, determined by the linear perturbation analysis of a self-gravitating fluid,
below which large fluctuations are expected in standard models, and through the determination of
statistical properties of mock galaxy catalogs generated from cosmological N-body simulations of
the Millennium consortium.

Introduction. - In the past twenty years observations have provided growing evidence
that the galaxy distribution is organized in a complex network of structures and
voids [1-4]. Although large scale galaxy structures, with sizes of the order of several
hundred Mpc/h (1), have been observed to be the typical feature of the distribution
of visible matter in the local universe, the statistical analysis measuring their
properties has identified a characteristic scale which has changed only slightly since
its discovery forty years ago in angular catalogs. This scale, r_0, was measured to be
the one at which fluctuations in the galaxy density field are about twice the value of the
sample density, and it was determined to be r_0 ≈ 5 Mpc/h in the Shane and Wirtanen
angular catalog [5]. Subsequent measurements of this scale (see e.g. [6-13]) found a
similar value, although in several samples larger values of r_0 have been found
(i.e. r_0 ≈ 6-12 Mpc/h). This variation was then ascribed to a luminosity dependent
effect (see e.g. [7-9, 12]).

(1) We use H_0 = 100 h km/sec/Mpc for the value of the Hubble constant.

However, a recent CCD survey of bright galaxies within the Northern and Southern strips
of the 2dF Galaxy Redshift Survey (2dFGRS) [3] found conclusive evidence of fluctuations
of the order of ~ 30% in galaxy counts as a function of apparent magnitude [14] (see also
[15,16] for similar observations in other galaxy samples). Furthermore, in the angular
region toward the Southern galactic cap (SGC) a deficiency with respect to the Northern
galactic cap (NGC) was found in the counts below magnitude ~ 17 (in the B filter),
persisting over the full area of the APM and APMBGC catalogs. This would be evidence
of a large void of radius about 150 Mpc/h, implying that there are spatial correlations
extending to scales larger than the scale detected by the 2dFGRS correlation function
[10,11]. Indeed, by considering the two-point correlation function, and thus by
normalizing the amplitude of fluctuations to the estimation of the sample density, the
length scale r_0 ≈ 6-8 Mpc/h was derived [10, 11].

Structures and fluctuations at scales of the order of 100 Mpc/h or more are at odds with
the predictions of the concordance model of galaxy formation [14-16], while the small
value of the correlation length is compatible with it. In what follows we try to clarify
this puzzling situation, i.e. the coexistence of the small typical length scales measured
by the two-point correlation function analysis with the large fluctuations in the galaxy
density field on large scales as measured by simple galaxy counts. Because of the
difference in the amplitude of the counts, and thus in the sample density, between the
NGC and the SGC samples, the estimation of the sample density is not stable. One must
therefore critically consider the significance of normalizing the fluctuation amplitude
to the estimated sample density, as done in the correlation analysis employed to measure
the length scale r_0.

More generally, the problem of the statistical characterization of these structures in a
finite sample, of volume V containing M galaxies, can be rephrased as the problem
of measuring volume averaged statistical quantities. The basic issue is whether these
are meaningful descriptors, i.e. whether or not they give stable statistical estimates
of ensemble averaged quantities [17]. In general it is assumed that the galaxy
distribution is an ergodic stationary stochastic process [17], which means that it is
statistically translationally and rotationally invariant, thus avoiding special points
or directions. Stationary stochastic distributions satisfy these conditions also when
they have zero average density in the infinite volume limit [17]. The assumption of
ergodicity implies that in a single realization of the microscopic number density field
n(r) the average density n_0 in the infinite volume is well defined and equal to the
ensemble average density [17]. The constant n_0 is strictly positive for homogeneous
distributions and it is zero for infinite inhomogeneous ones [17]. The infinite volume
limit must be considered in the definition of probabilistic properties, but in physical
systems one is concerned only with finite volumes and statistical determinations. For
inhomogeneous distributions, in a finite sample, the estimation of the average mass
density has a large relative error with respect to the ensemble value and it is thus
systematically biased [17]. This situation occurs as long as the sample size is smaller
than the scale λ_0 at which the distribution turns to homogeneity, i.e. beyond which
density fluctuations are small [17]. In the finite sample analysis it is then necessary
to study the conditional scaling properties of statistical quantities, by an analysis
of fluctuations and correlations which explicitly considers whether or not a
distribution can be homogeneous. Before turning to the description of the methods
employed to study galaxy distributions and mock galaxy samples we discuss the properties
of the samples considered.

The Data. - The Two degree Field Galaxy Redshift Survey (2dFGRS) (2) [3] measured
redshifts for more than 220,000 galaxies in two strips, one in the southern galactic
cap (SGC) and one in the northern galactic cap (NGC). The median redshift is z ~ 0.1
and the apparent magnitude corrected for galactic extinction in the b_J filter is limited
to 14.0 < b_J < 19.45. The selection of the samples used in the analysis discussed below
is described in [18]. To avoid the effect of the irregular edges of the survey we selected
two rectangular regions whose limits are: for SGC, 84° x 9°
(-33° < δ < -24°, -32° < α < 52°), and for NGC, 60° x 6°
(-4° < δ < 2°, 150° < α < 210°). We construct Volume Limited (VL) samples, which are
unbiased for the observational selection effects due to the limit in apparent magnitude
[12]. To this aim we computed the metric distance R(z) with parameters Ω_M = 0.3 and
Ω_Λ = 0.7 (i.e. the concordance model) and we determined absolute magnitudes M using
K-corrections from [19]. Two pairs of VL samples, one in each galactic cap, are
identified by (i) 100 Mpc/h < R < 400 Mpc/h and -20.8 < M < -19.0 (SGC400 and NGC400)
and (ii) 150 Mpc/h < R < 550 Mpc/h and -21.2 < M < -19.8 (SGC550 and NGC550).
Each sample contains about N ≈ (2-3) · 10^4 galaxies [18].

(2) http://www.mso.anu.edu.au/2dFGRS/
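The construction of a VL sample can be illustrated with a short numerical sketch. It assumes a flat model with Ω_M = 0.3 and Ω_Λ = 0.7 and distances in Mpc/h. The K-correction used here, k(z) = 2z, is a placeholder assumption and not the correction of [19]; all function names are hypothetical and chosen only for illustration.

```python
import math

C_OVER_H0 = 2997.92458  # c / (100 km/s/Mpc), i.e. the Hubble distance in Mpc/h


def metric_distance(z, omega_m=0.3, omega_l=0.7, steps=200):
    """Metric (comoving) distance R(z) in Mpc/h for a flat model, via Simpson's rule."""
    if z <= 0.0:
        return 0.0
    inv_e = lambda x: 1.0 / math.sqrt(omega_m * (1.0 + x) ** 3 + omega_l)
    h = z / steps  # steps must be even for Simpson's rule
    total = inv_e(0.0) + inv_e(z)
    for k in range(1, steps):
        total += (4.0 if k % 2 else 2.0) * inv_e(k * h)
    return C_OVER_H0 * total * h / 3.0


def absolute_magnitude(m, z, k_corr=lambda zz: 2.0 * zz):
    """M = m - 5 log10[R(z)(1+z)] - 25 - K(z); k_corr is a placeholder assumption."""
    d_lum = metric_distance(z) * (1.0 + z)  # luminosity distance in Mpc/h
    return m - 5.0 * math.log10(d_lum) - 25.0 - k_corr(z)


def in_volume_limited_sample(m, z, r_min, r_max, m_bright, m_faint):
    """True if a galaxy passes both the distance and the absolute-magnitude cuts."""
    r = metric_distance(z)
    mag = absolute_magnitude(m, z)
    return (r_min < r < r_max) and (m_bright < mag < m_faint)
```

With these assumptions, a galaxy is kept in a sample of the SGC400 type when `in_volume_limited_sample(m, z, 100.0, 400.0, -20.8, -19.0)` is true.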

Statistical methods. - The scale-length (SL) analysis [20] consists in the determination
of the number N(r; R_i) of galaxies in spheres of radius r, centered on
the i-th galaxy at radial distance R_i from the observer.
When this is averaged over the whole sample it gives an
estimate of the average conditional number of galaxies in
spheres of radius r [17,20]

$$ \overline{N(r)} = \frac{1}{M(r)} \sum_{i=1}^{M(r)} N(r; R_i) , \qquad (1) $$

where the sum is extended to the M(r) galaxies whose distance from the boundaries of
the sample is larger than or equal to r. In this way M(r) decreases as r grows, because
only those galaxies for which the sphere is fully included in the sample volume are
considered as centers [17]. In addition, when r is large enough only a part of the
sample is explored by the volume average [17,20]: the galaxies contributing to the
average in Eq. 1 are mostly located at radial distances in the range
[R_min + r, R_max - r], where [R_min, R_max] are the radial boundaries of the sample.
With these boundary conditions Eq. 1 gives the so-called full-shell estimator [17,21].
This has the advantage of making the weakest a priori assumptions about the properties
of the distribution outside the sample volume. Indeed one may use incomplete spheres,
by counting the galaxies inside a portion of a sphere and weighting this count by
the corresponding volume [21]. However this method implicitly assumes that what is
inside the incomplete sphere is a statistically meaningful estimate of the whole
spherical volume. This is incorrect when a distribution presents large fluctuations.
For example, in the part of a spherical volume which lies outside the sample boundaries
there can be an empty region or a large scale structure: in this
situation the weighted estimate is biased [17].

Fig. 1: Left panels: from top to bottom, the SL analysis for the different 2dFGRS samples with r = 5, 10 Mpc/h (NGC400) and
r = 20, 30 Mpc/h (SGC550). Right panels: probability density function f(N,r) of N(r;R_i) in the whole sample (thick solid
line) and in two non-overlapping sub-samples with equal volume (each half of the sample volume) at small (thin solid line)
and large (dashed line) R.

In a finite sample, together with the average given by Eq. 1, one may
determine amplitudes of density fluctuations by measuring their variance

$$ \delta(r)^2 = \frac{\overline{N(r)^2} - \overline{N(r)}^2}{\overline{N(r)}^2} \sim \mathrm{const} , $$

where the last relation holds for inhomogeneous distributions (e.g. for a fractal,
N(r) ~ r^D with D < 3) and means that fluctuations are persistent [17]. In such a
situation, because of the strong correlations, the Central Limit Theorem does not hold
and the probability density function (PDF) of fluctuations does not generally converge
to a Gaussian function, as it does for homogeneous distributions, where
δ(r)^2 ≪ 1 [17].
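The full-shell counting and the normalized variance described above can be sketched as follows. This is only an illustration under simplifying assumptions: the sample is taken to be a cubic box rather than the wedge-shaped survey volume, a brute-force distance computation is used, and the function names are hypothetical. The counts exclude the central point.

```python
import numpy as np


def full_shell_counts(points, r, box_size):
    """N(r; i) for every point whose sphere of radius r lies fully inside the box.

    The sample volume is assumed to be the cube [0, box_size]^3 (an idealization);
    the central point itself is not counted.
    """
    points = np.asarray(points, dtype=float)
    # Distance of each point from the nearest face of the box.
    dist_to_edge = np.minimum(points, box_size - points).min(axis=1)
    centers = points[dist_to_edge >= r]
    counts = np.empty(len(centers), dtype=int)
    for k, c in enumerate(centers):
        d2 = np.sum((points - c) ** 2, axis=1)
        counts[k] = np.count_nonzero(d2 <= r * r) - 1  # remove the center
    return counts


def mean_and_normalized_variance(counts):
    """Volume average of N(r; i) and delta^2 = (<N^2> - <N>^2) / <N>^2."""
    counts = np.asarray(counts, dtype=float)
    mean = counts.mean()
    return mean, (np.mean(counts ** 2) - mean ** 2) / mean ** 2
```

For a uniform random (Poisson) set of points the average returned by this sketch should be close to the mean density times the sphere volume, and the normalized variance should decrease as r grows; for a distribution with persistent fluctuations it would not.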

The SL analysis of two 2dFGRS samples (Fig. 1) shows large density fluctuations in the
locations corresponding to large scale structures. Large scale structures, with a
thickness of about 30 Mpc/h, transversely cross the NGC400 volume at about 250, 260,
290 and 320 Mpc/h. When the sphere radius r is increased from 5 to 10 Mpc/h the
most prominent structure is the one at about 250 Mpc/h. This is due to the geometrical
selection effect previously discussed. In the SGC550 sample the situation is similar
to the NGC400 case, except for the fact that the radial distances corresponding to the
large variations (i.e. structures) in N(r;R_i) are different. We note that the same
structures we observe in Fig. 1 have also been identified with different methods [22,23].

Thus, the galaxy distribution in these samples is dominated by several large scale
structures which cross their volumes. These structures are typical, i.e. they are
detected at different radial distances and in two different sky areas of the 2dFGRS.
They correspond to large density fluctuations, i.e. large variations of N(r;R_i). For
the largest sphere radius we considered, r ≈ 40 Mpc/h, we find fluctuations of a factor
of order four in N(r;R_i). This implies that λ_0 > 40 Mpc/h.

Convergence to homogeneity? - We can now use the data obtained by the SL analysis to
investigate whether there is a convergence to homogeneity at some large scale
r > 40 Mpc/h. This is achieved by dividing the whole range of radial distances in bins
of thickness ΔR centered at R, and computing in each bin the average N(r; R, ΔR) of
N(r; R_i) at fixed r (Fig. 2). To this aim we used small radii, r = 5, 10 Mpc/h, in
order to avoid substantial overlap in space between neighboring bins in radial distance.
We expect that, if the distribution converges to homogeneity at a scale λ_0, then
N(r; R, ΔR > λ_0) does not show large fluctuations as a function of R. On the contrary,
for the largest radial bin chosen, ΔR = 75 Mpc/h, the measurements in bins centered at
different R scatter wildly in the SGC and NGC samples, i.e. their values differ by more
than the statistical error bars. These results show that structures, leading to
persistent fluctuations up to the largest scales sampled by this catalog, have an
amplitude which is incompatible with homogeneity at scales λ_0 < ΔR = 75 Mpc/h.

Fig. 2: Upper panels: the left panel shows the average number of points in spheres of
radius r around a galaxy for the samples SGC400 and NGC400, and the right panel the same
for SGC550 and NGC550. A reference line with slope 2.2 is reported with the same
amplitude in both panels. The different amplitude in the two pairs of samples is
ascribed to the different limits in absolute magnitude [17]. Bottom panels: the left
panel shows the average value of N(r; R_i) with r = 10 Mpc/h in bins of ΔR = 50 Mpc/h,
and the right panel the same for r = 10 Mpc/h and ΔR = 75 Mpc/h.

A smooth redshift-dependent correction (i.e. galaxy evolution) would yield, at fixed R,
the same correction in the SGC and NGC: thus the fluctuations detected cannot
be an artifact of such an effect.
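The radial binning of N(r; R_i) used in this test can be sketched as below. The function name and the error estimate are assumptions made for illustration: the error bar is the standard error of the mean in each bin, which treats the sphere centers as independent and therefore tends to underestimate the true uncertainty.

```python
import numpy as np


def radial_bin_average(radial_dist, counts, r_min, r_max, delta_r):
    """Average N(r; R_i) in radial bins of thickness delta_r.

    Returns the bin centers, the bin averages and their standard errors.
    """
    radial_dist = np.asarray(radial_dist, dtype=float)
    counts = np.asarray(counts, dtype=float)
    edges = np.arange(r_min, r_max + 0.5 * delta_r, delta_r)
    centers, means, errors = [], [], []
    for lo, hi in zip(edges[:-1], edges[1:]):
        sel = (radial_dist >= lo) & (radial_dist < hi)
        n_sel = np.count_nonzero(sel)
        if n_sel < 2:
            continue  # too few centers in this bin for an error estimate
        centers.append(0.5 * (lo + hi))
        means.append(counts[sel].mean())
        # Standard error of the mean, assuming independent centers (an idealization).
        errors.append(counts[sel].std(ddof=1) / np.sqrt(n_sel))
    return np.array(centers), np.array(means), np.array(errors)
```

Comparing the scatter of the bin averages with these error bars, for ΔR = 50 and 75 Mpc/h, is the kind of check described in the text.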

The frequency distribution of conditional fluctuations gives an estimate of the PDF
f(N, r), computed at fixed sphere radius r (Fig. 1). In all cases f(N, r) is
non-Gaussian and statistically stable, i.e. it does not change when it is computed in
the whole sample or in two non-overlapping sub-samples with equal volume at small and
large radial distance. The tail of f(N, r) for large values of N is instead affected by
the different fluctuations in the different sub-volumes. The trend is clear: the larger
the fluctuations of N(r;R_i), the more extended toward large N values is the tail of
f(N, r). In each sub-sample, for the largest sphere radius r we find that f(N, r) is
systematically distorted with respect to smaller sphere radii. This is due to the fact
that, for large sphere radii, the volume average cannot properly explore the full
sample because of the geometrical selection effect present in the determination
of N(r;R_i), and it is dominated by only a few structures.
The determination of the whole-sample average statistics, i.e. Eq. 1, provides a
meaningful statistical quantity as the PDF in all samples is reasonably statistically
stable. We find N(r) ∝ r^D with D = 2.2 ± 0.2 up to r ~ 40 Mpc/h
(Fig. 2), in agreement with previous determinations [18].
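A power-law exponent of this kind can be estimated with a least-squares fit in log-log space. The sketch below is a generic illustration with a hypothetical function name, not the fitting procedure used for the quoted value and its uncertainty.

```python
import numpy as np


def fit_fractal_dimension(radii, mean_counts):
    """Fit log N = log B + D log r and return the slope D and the amplitude B."""
    log_r = np.log(np.asarray(radii, dtype=float))
    log_n = np.log(np.asarray(mean_counts, dtype=float))
    slope, intercept = np.polyfit(log_r, log_n, 1)
    return slope, np.exp(intercept)
```

The fit should be restricted to the range of radii where the average is well sampled; in practice the uncertainty on D would be assessed by repeating the fit in different samples or sub-samples.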

Estimation of the standard two-point correlation function. - The estimator of the
two-point correlation function can be written as [17]

$$ \xi(r) + 1 = \frac{d\overline{N(r)}}{dV(r)} \cdot \frac{1}{n_S} , $$

where n_S is the sample density. If N(r) ∝ r^D then ξ(r) has the following features:
(i) its amplitude is proportional to the sample size and (ii) it shows a break from a
power law at a scale of the order of the sample size. The amplitude of ξ(r) is then a
ratio between a local and a global quantity (n_S). The latter can be estimated for
instance as n_S = N/V. When V is spherical with radius R_s we get that ξ(r_0) = 1 for
r_0 = (D/6)^{1/(3-D)} R_s. The geometry of the 2dFGRS samples is a spherical portion
for which the radius of the largest sphere fully enclosed is about R_s ≈ 40 Mpc/h.
Given that D ≈ 2, we get r_0 ≈ 10 Mpc/h, which is approximately the value obtained by
[10,11]. The normalized mass variance is equal to unity at approximately the same
scale [17]. A more detailed discussion of the determination of the two-point
correlation function can be found in [18,33].
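The dependence of the correlation length on the sample size can be checked with a few lines of code. The sketch assumes a pure power law N(r) = B r^D inside a spherical sample of radius R_s, the sample density n_S = N(R_s)/V(R_s), and the estimator ξ(r) + 1 = (1/n_S) dN/dV with dV = 4π r² dr. Under these assumptions ξ(r) = (D/3)(r/R_s)^(D-3) - 1. The function names are hypothetical.

```python
def xi_power_law(r, dim, r_sample):
    """xi(r) for N(r) = B r^D in a sphere of radius r_sample, with n_S = N(R_s)/V(R_s)."""
    return (dim / 3.0) * (r / r_sample) ** (dim - 3.0) - 1.0


def correlation_length(dim, r_sample):
    """Scale r_0 at which xi(r_0) = 1, i.e. r_0 = (D/6)^(1/(3-D)) R_s."""
    return (dim / 6.0) ** (1.0 / (3.0 - dim)) * r_sample
```

For D = 2 and R_s = 40 Mpc/h this gives r_0 = 40/3 Mpc/h, of the order of 10 Mpc/h, and r_0 grows linearly with R_s: in this case the scale reflects the sample size rather than an intrinsic property of the distribution.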

Comparisons with standard models of galaxy formation. - In cold dark matter models of
cosmological structure formation [25], gravitational collapse first forms non-linear
structures (i.e. large fluctuations) at small scales and then larger and larger scales
become non-linear. The theoretical homogeneity scale λ_0^m, identifying the range of
distances where large inhomogeneities are formed, can be defined such that the
unconditional relative mass variance in spheres is σ²(λ_0^m) = 1 [24]. Thus, from the
time dependence of the power spectrum in the linear perturbation analysis of the
self-gravitating fluid equations it is possible to derive the time dependence of
λ_0^m(t) [24]. By normalizing the initial amplitude of density fluctuations
to the Cosmic Microwave Background Radiation (CMBR) anisotropies it is found that at
the present time, in the concordance model, λ_0^m ≈ 10 Mpc/h [26,28,29].

This estimate is in agreement with the results of cosmological N-body simulations,
which are used to study the non-linear regime for r < λ_0^m. Here we considered
the cosmological simulations performed by the Millennium project, which are the largest
ones performed until now [29]. The amount of dark matter and the cosmological parameters
are chosen in agreement with the standard concordance model. The dark matter simulations
have about 10^10 particles and galaxies are identified according to semi-analytic models
of galaxy formation [34]. We used a mock galaxy catalog with about 9 million objects,
where absolute magnitudes of mock galaxies can be transformed into the same b_J filter
of the 2dFGRS [34]. We have cut samples with almost the same geometry, number of objects
and limits in magnitude and distance as the 2dFGRS samples. Note that we computed
N(r; R_i) where R_i is the real-space position of each object with respect to the
observer: in redshift space the dimension for r < 10 Mpc/h is slightly different [33].
We find (Fig. 3) that N(r) ~ r^{1.2} for r < 10 Mpc/h and N(r) ~ r^3 for r > 10 Mpc/h.
The PDF of conditional fluctuations rapidly converges to a Gaussian for r > 10 Mpc/h
and it is statistically stable (Fig. 3). Correspondingly N(r;R_i) shows different and
smaller fluctuations than the real data.

Fig. 3: From top to bottom: the SL analysis for r = 10, 20 Mpc/h (left) for the mock
samples and their PDF (right). The dashed line is the best fit with a Gaussian function.
Fifth panel: average number of points in spheres of radius r around a galaxy. The
reference lines have slopes D = 1.2 (in real space) for r < 10 Mpc/h and D = 3 for
r > 10 Mpc/h. Bottom panels: (left) average value of N(r; R_i) with r = 10 Mpc/h in
bins of ΔR = 50 Mpc/h; (right) the same for r = 10 Mpc/h and ΔR = 75 Mpc/h.

Note that at scales r > 10 Mpc/h density fluctuations in the dark matter field are in
the linear regime and thus the understanding of biasing in that case is simple. Indeed,
according to the simple threshold sampling of a Gaussian field [27], biasing is linear
when fluctuations are small and Gaussian [26]. In addition, only non-local biasing
mechanisms, which at the moment have not been explored in the literature, could possibly
produce large scale density fluctuations of the kind observed in the
galaxy distribution.

Discussion. - In summary, by applying the SL method to the 2dFGRS samples we detect
large density fluctuations of considerable spatial extension. At scales
r < 40 Mpc/h we find statistically stable power-law correlations with fractal dimension
D = 2.2 ± 0.2, in agreement with previous determinations [17,18,20,30-32]. For
r > 40 Mpc/h we find that the galaxy distribution is strongly inhomogeneous and
fluctuations are large up to the sample sizes, in agreement with a similar analysis of
the SDSS data [20]. Persistent large scale density fluctuations are compatible [17]
with fractal power-law correlations extending to scales r > 40 Mpc/h but incompatible
with homogeneity at λ_0 < 75 Mpc/h. On the other hand, standard models of galaxy
formation, normalized to CMBR anisotropies, predict λ_0^m ≈ 10 Mpc/h [24,29], i.e.
smaller than our lower limit λ_0 > 75 Mpc/h. This prediction is in agreement with the
results we found in mock galaxy catalogs, where we measured that fluctuations are
smoother than in the 2dFGRS samples and that their PDF rapidly converges to a Gaussian
function for r > 10 Mpc/h.






Our results are in contrast with the standard determinations, according to which the
characteristic length scale of the galaxy distribution, marking the transition to the
regime of small fluctuations, is of the order of 10 Mpc/h [11-13]. This
is because this length scale is derived by measuring the amplitude of the two-point
correlation function ξ(r). When considering this quantity, which is normalized to the
estimation of the sample density, it is implicitly assumed that the distribution is
homogeneous (i.e. with small amplitude fluctuations) well inside the sample volume, i.e.
λ_0 ≪ V^{1/3} [17]. When fluctuations are large, as in the case of the 2dFGRS samples,
this descriptor is systematically biased by finite size effects [17, 18, 20] and so is
the characteristic length scale derived from its amplitude.
On the other hand our results agree fairly well with studies of galaxy counts as a
function of the apparent magnitude, N(m), which indirectly probe radial distance
fluctuations. These show large fluctuations around the average behavior: in particular,
N(m) in the SGC is lower by 30% relative to the NGC counts [14]. These behaviors can
now be directly related to large scale galaxy structures, and in particular to the fact
that in the NGC samples there are more structures, and thus a higher amplitude of
N(r; R, ΔR) (Fig. 2), than in the SGC samples.

Finally it is worth noting that our results agree with
the conclusion of [23], who found that large scale structures
(e.g. super-clusters) are more frequent in observed
samples than in the simulations.